
    A dynamic framework based on local Zernike Moment and motion history image for facial expression recognition

    A dynamic descriptor facilitates robust recognition of facial expressions in video sequences. The two main current approaches to recognition are basic emotion recognition and recognition based on facial action coding system (FACS) action units. In this paper we focus on basic emotion recognition and propose a spatio-temporal feature based on the local Zernike moment in the spatial domain, using motion change frequency. We also design a dynamic feature comprising the motion history image and entropy. To recognise a facial expression, a weighting strategy based on the latter feature and a sub-division of the image frame is applied to the former to enhance the dynamic information of the facial expression, followed by the application of a classical support vector machine. Experiments on the CK+ and MMI datasets using a leave-one-out cross-validation scheme demonstrate that the integrated framework achieves better performance than either descriptor used separately. Compared with six state-of-the-art methods, the proposed framework demonstrates superior performance.
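    As an illustration of the motion-history component, here is a minimal NumPy sketch of a standard motion history image update (the timestamp-decay formulation); the parameter names and the thresholded frame-difference input are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def update_mhi(mhi, frame_diff, tau=255, delta=32, decay=1):
    """One step of a motion history image (MHI) update.

    mhi:        float array holding the current motion history
    frame_diff: absolute difference between consecutive grayscale frames
    tau:        value assigned to pixels where motion is detected
    delta:      motion threshold on the frame difference
    decay:      amount by which stale history fades each step
    """
    motion = frame_diff > delta
    # Pixels with motion are reset to tau; the rest decay towards zero.
    return np.where(motion, tau, np.maximum(mhi - decay, 0))
```

    The entropy of the resulting MHI can then serve as a scalar summary of how much motion each facial sub-region carries, roughly the role the dynamic feature plays in the weighting strategy.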

    A spatial-temporal framework based on histogram of gradients and optical flow for facial expression recognition in video sequences

    Facial expression causes different parts of the facial region to change over time, and thus dynamic descriptors are inherently more suitable than static descriptors for recognising facial expressions. In this paper, we extend the spatial pyramid histogram of gradients to the spatio-temporal domain to give 3-dimensional facial features and integrate them with dense optical flow to give a spatio-temporal descriptor which extracts both the spatial and dynamic motion information of facial expressions. A multi-class support vector machine classifier with a one-vs-one strategy is used to recognise facial expressions. Experiments on the CK+ and MMI datasets using a leave-one-out cross-validation scheme demonstrate that the integrated framework achieves better performance than either descriptor used separately. Compared with six state-of-the-art methods, the proposed framework demonstrates superior performance.
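    To make the flow component concrete, the following Python sketch computes dense optical flow between two frames with OpenCV's Farnebäck method and summarises it as a magnitude-weighted orientation histogram; this is a generic simplification, not the authors' exact PHOG-based spatio-temporal descriptor, and the parameter values are illustrative assumptions.

```python
import cv2
import numpy as np

def flow_orientation_histogram(prev_gray, next_gray, bins=8):
    # Dense optical flow: one 2D displacement vector per pixel.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    # Orientation histogram weighted by motion magnitude, L1-normalised.
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)
```

    Concatenating such histograms over facial sub-regions and time windows yields a simple spatio-temporal motion feature that a multi-class SVM can consume.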

    Robust contactless pulse transit time estimation based on signal quality metric

    The pulse transit time (PTT) can provide valuable insight into cardiovascular health, specifically regarding arterial stiffness and blood pressure. Traditionally, PTT is derived by calculating the time difference between two photoplethysmography (PPG) measurements, which requires a set of body-worn sensors attached to the skin. Recently, remote photoplethysmography (rPPG) has been proposed as a contactless monitoring alternative. The main problem with rPPG-based PTT estimation is that motion artifacts affect the shape of the waveform, leading to shifted or over-detected peaks, which decreases the accuracy of the PTT. To overcome this problem, this paper presents a robust pulse-by-pulse PTT estimation framework using a signal quality metric. By exploiting local temporal information and global periodic characteristics, the metric automatically assesses signal quality on a pulse-by-pulse basis and calculates the probability that each detected peak is an actual pulse peak. Furthermore, to cope with over-detected and shifted pulse peaks, a Kalman filter complemented by the proposed signal quality metric is used to adaptively adjust the peaks based on the estimated probability. All the refined peaks are finally used for pulse-by-pulse PTT estimation. The experimental results are promising, suggesting that the proposed framework provides a robust and more accurate PTT estimation in real applications.
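    For intuition, here is a deliberately naive Python sketch of pulse-by-pulse PTT from two synchronised PPG-like signals; it omits the paper's signal quality metric and Kalman-based peak refinement, and matching peaks by index is an assumption that only holds for clean signals.

```python
import numpy as np
from scipy.signal import find_peaks

def naive_ptt(ppg_proximal, ppg_distal, fs):
    """Pulse-by-pulse PTT as the time lag between matched peaks (seconds).

    fs: sampling rate in Hz. A minimum peak distance of ~0.4 s keeps
    detections to at most one per plausible cardiac cycle.
    """
    p1, _ = find_peaks(ppg_proximal, distance=int(0.4 * fs))
    p2, _ = find_peaks(ppg_distal, distance=int(0.4 * fs))
    n = min(len(p1), len(p2))
    return (p2[:n] - p1[:n]) / fs
```

    In the paper's framework, each detected peak would additionally carry a quality-derived probability, and low-probability (shifted or spurious) peaks would be corrected by the Kalman filter before the PTT is computed.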

    Fusing dynamic deep learned features and handcrafted features for facial expression recognition

    The automated recognition of facial expressions has been actively researched due to its wide-ranging applications. Recent advances in deep learning have improved the performance of facial expression recognition (FER) methods. In this paper, we propose a framework that combines discriminative features learned using convolutional neural networks with handcrafted features, including shape- and appearance-based features, to further improve the robustness and accuracy of FER. In addition, texture information is extracted from facial patches to enhance the discriminative power of the extracted features. By encoding shape, appearance, and deep dynamic information, the proposed framework provides high performance and outperforms state-of-the-art FER methods on the CK+ dataset.
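    A hedged sketch of the fusion step: deep and handcrafted feature vectors are concatenated per sample and fed to a classifier. The feature extractors themselves are assumed to exist already, and the linear SVM here is an illustrative stand-in rather than the paper's exact classifier.

```python
import numpy as np
from sklearn.svm import SVC

def fuse_and_train(deep_feats, handcrafted_feats, labels):
    """deep_feats: (N, D1) CNN features; handcrafted_feats: (N, D2)
    shape/appearance features; labels: (N,) expression classes."""
    fused = np.concatenate([deep_feats, handcrafted_feats], axis=1)
    clf = SVC(kernel="linear")
    clf.fit(fused, labels)
    return clf
```

    In practice the two feature families are often normalised separately (e.g., z-scored) before concatenation so that neither dominates the fused representation.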

    Spatio-temporal framework on facial expression recognition.

    This thesis presents an investigation into two topics that are important in facial expression recognition: how to employ the dynamic information in facial expression image sequences, and how to efficiently extract context and other relevant information from different facial regions. This involves the development of spatio-temporal frameworks for recognising facial expression. The thesis proposes three novel frameworks.

    The first framework uses sparse representation to extract features from patches of a face to improve recognition performance, applying part-based methods that are robust to image misalignment. In addition, the use of sparse representation reduces the dimensionality of the features, enhances their semantic meaning, and represents a face image more efficiently.

    Since a facial expression involves a dynamic process, and that process contains information that describes the expression more effectively, it is important to capture such dynamic information in order to recognise facial expressions over an entire video sequence. Thus, the second framework uses two types of dynamic information to enhance recognition: a novel spatio-temporal descriptor based on PHOG (pyramid histogram of gradients) to represent changes in facial shape, and dense optical flow to estimate the movement (displacement) of facial landmarks. The framework views an image sequence as a spatio-temporal volume, and uses temporal information to represent the dynamic movement of facial landmarks associated with a facial expression. Specifically, a spatial descriptor representing local shape is extended to the spatio-temporal domain to capture changes in the local shape of facial sub-regions along the temporal dimension, giving 3D facial component sub-regions for the forehead, mouth, eyebrows and nose. Dense optical flow is also employed to extract temporal information. The fusion of these two descriptors enhances the dynamic information and achieves better performance than either descriptor alone.

    The third framework also focuses on analysing the dynamics of facial expression sequences to represent spatio-temporal dynamic information (i.e., velocity). Two types of features are generated: a spatio-temporal shape representation that enhances the local spatial and dynamic information, and a dynamic appearance representation. In addition, an entropy-based method is introduced to capture the spatial relationships between different parts of the face by computing the entropy values of its sub-regions.
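    As a small illustration of the entropy-based spatial analysis described for the third framework, the sketch below computes the Shannon entropy of each cell in a grid of facial sub-regions; the grid size and 8-bit grayscale input are assumptions for illustration.

```python
import numpy as np

def subregion_entropy(gray, grid=4):
    """Shannon entropy (bits) of each cell in a grid x grid partition
    of an 8-bit grayscale face image."""
    h, w = gray.shape
    ph, pw = h // grid, w // grid
    ent = np.zeros((grid, grid))
    for i in range(grid):
        for j in range(grid):
            patch = gray[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
            hist, _ = np.histogram(patch, bins=256, range=(0, 256),
                                   density=True)
            p = hist[hist > 0]
            ent[i, j] = -np.sum(p * np.log2(p))
    return ent
```

    Higher-entropy sub-regions (typically the mouth and eyes during an expression) can then be given more weight when the per-region descriptors are combined.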

    A Segmentation-Guided Deep Learning Framework for Leaf Counting

    Deep learning-based methods have recently provided a means to rapidly and effectively extract various plant traits, due to their powerful ability to depict a plant image across a variety of species and growth conditions. In this study, we focus on two fundamental tasks in plant phenotyping, i.e., plant segmentation and leaf counting, and propose a two-stream deep learning framework for segmenting plants and counting leaves of various sizes and shapes from two-dimensional plant images. In the first stream, a multi-scale segmentation model using a spatial pyramid is developed to extract leaves of different sizes and shapes, where the fine-grained details of leaves are captured using a deep feature extractor. In the second stream, a regression counting model is proposed to estimate the number of leaves without any pre-detection, where an auxiliary binary mask from the segmentation stream is introduced to enhance the counting performance by effectively alleviating the influence of complex backgrounds. Extensive pot experiments were conducted on the CVPPP 2017 Leaf Counting Challenge dataset, which contains images of Arabidopsis and tobacco plants. The experimental results demonstrate that the proposed framework achieves promising performance in both plant segmentation and leaf counting, providing a reference for the automatic analysis of plant phenotypes.
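    The following PyTorch sketch shows the idea of mask-guided counting: a predicted binary mask from the segmentation stream gates the input before a regression head estimates the leaf count. The tiny backbone and the simple multiplicative gating are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MaskGuidedCounter(nn.Module):
    """Regress a leaf count from an image gated by a segmentation mask."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, image, mask):
        # mask: (N, 1, H, W) in [0, 1]; broadcasting suppresses background.
        gated = image * mask
        feats = self.backbone(gated).flatten(1)
        return self.head(feats)  # (N, 1) predicted count
```

    Because the count is regressed directly, no per-leaf detection step is needed, which matches the abstract's description of counting without pre-detection.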

    Discriminative attention-augmented feature learning for facial expression recognition in the wild

    Facial expression recognition (FER) in the wild is challenging due to unconstrained settings such as varying head poses, illumination, and occlusions. In addition, the performance of a FER system degrades significantly under the large intra-class variation and inter-class similarity of facial expressions in real-world scenarios. To mitigate these problems, we propose a novel approach, the Discriminative Attention-augmented Feature Learning Convolutional Neural Network (DAF-CNN), which learns discriminative expression-related representations for FER. Firstly, we develop a 3D attention mechanism for feature refinement which selectively focuses on attentive channel entries and salient spatial regions of a convolutional neural network feature map. Moreover, a deep metric loss termed the Triplet-Center (TC) loss is incorporated to further enhance the discriminative power of the deeply-learned features under an expression-similarity constraint. It simultaneously minimizes intra-class distance and maximizes inter-class distance to learn features that are both compact and well separated. Extensive experiments have been conducted on two representative facial expression datasets (FER-2013 and SFEW 2.0), demonstrating that DAF-CNN effectively captures discriminative feature representations and achieves competitive or even superior FER performance compared to state-of-the-art methods.
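    A minimal PyTorch sketch of a triplet-center-style loss, assuming learnable per-class center vectors are maintained alongside the network; the margin value and the Euclidean distance are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def triplet_center_loss(features, labels, centers, margin=1.0):
    """features: (N, D) embeddings; labels: (N,) int64 classes;
    centers: (C, D) one learnable center per expression class."""
    d = torch.cdist(features, centers)                 # (N, C) distances
    pos = d.gather(1, labels.unsqueeze(1)).squeeze(1)  # own-center distance
    d_other = d.clone()
    d_other.scatter_(1, labels.unsqueeze(1), float("inf"))
    neg = d_other.min(dim=1).values                    # nearest rival center
    # Pull towards own center, push past the nearest other center.
    return F.relu(pos - neg + margin).mean()
```

    The hinge goes to zero once a feature is at least `margin` closer to its own center than to any other center, which is exactly the compact-and-separated geometry the abstract describes.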

    TCDNet: tree crown detection from UAV optical images using uncertainty-aware one-stage network

    Tree crown detection plays a vital role in forestry management, resource statistics and yield forecasting. RGB high-resolution aerial images have emerged as a cost-effective source of data for tree crown detection. To address the challenges of detection in UAV optical images, we propose a one-stage object detection network, TCDNet. First, the network provides an attention-enhanced feature extraction module that enables the model to distinguish tree crowns from their complex backgrounds. Second, an efficient loss is introduced to make the network aware of the overlap between adjacent trees, thus effectively avoiding misdetection. Experimental results on two publicly available datasets show that the proposed network outperforms state-of-the-art networks in terms of precision, recall and mean average precision.
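    Since the abstract highlights an overlap-aware loss, here is a generic IoU loss sketch for axis-aligned boxes in PyTorch; it is a common stand-in for overlap-sensitive objectives, not necessarily the specific loss TCDNet uses.

```python
import torch

def iou_loss(pred, target, eps=1e-7):
    """pred, target: (N, 4) boxes as (x1, y1, x2, y2)."""
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    return (1.0 - inter / (union + eps)).mean()
```

    Optimising overlap directly, rather than regressing coordinates independently, penalises boxes that drift onto a neighbouring crown, which is the failure mode the abstract targets.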

    Semi-supervised learning for forest fire segmentation using UAV imagery

    Unmanned aerial vehicles (UAVs) are an efficient tool for monitoring forest fires due to their advantages, e.g., low cost, light weight and flexibility. Semantic segmentation can provide a means for an aircraft to rapidly and accurately determine the location of a forest fire. However, training a semantic segmentation model requires a large number of labeled images, which are labor-intensive and time-consuming to generate. To address the lack of labeled images, we propose, in this paper, a semi-supervised learning-based segmentation network, SemiFSNet. Taking into account the unique characteristics of UAV-acquired imagery of forest fires, the proposed method first applies occlusion-aware data augmentation to the labeled data to increase the robustness of the trained model. In SemiFSNet, a dynamic encoder network replaces ordinary convolution with dynamic convolution, enabling the learned features to better represent fires of varying size and shape. To mitigate the impact of complex scene backgrounds, we also propose a feature refinement module that integrates an attention mechanism to highlight salient feature information, thus improving the performance of the segmentation network. Additionally, consistency regularization is introduced to exploit the rich information contained in unlabeled data, thus aiding semi-supervised learning. To validate the effectiveness of the proposed method, extensive experiments were conducted on the FLAME dataset and the Corsican dataset. The experimental results show that the proposed model outperforms state-of-the-art methods and is competitive with its fully supervised counterpart.
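    To make the consistency-regularization idea concrete, the PyTorch sketch below encourages a segmentation model to give matching pixel-wise predictions for an unlabeled image and a photometrically augmented copy of it; the KL objective and the specific augmentation are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, images, augment):
    """images: unlabeled batch (N, 3, H, W); augment: a label-preserving,
    non-geometric transform (e.g., color jitter) so pixels stay aligned."""
    with torch.no_grad():
        target = model(images).softmax(dim=1)        # (N, C, H, W) probs
    pred = model(augment(images)).log_softmax(dim=1)
    return F.kl_div(pred, target, reduction="batchmean")
```

    This term is added to the supervised loss on the labeled images, so the unlabeled data shape the decision boundaries without ever needing masks.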
